  Spark / SPARK-3057

[Hive] Extra bytes detected at the end of the row!


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Not A Problem
    • Affects Version/s: 1.0.2
    • Fix Version/s: None
    • Component/s: SQL
    • Labels: None

    Description

      Running a HiveQL query failed. The log first shows the following INFO and WARN messages:

      14/08/15 10:19:55 INFO org.apache.spark.Logging$class.logInfo(Logging.scala:58): Input split: hdfs://yh/user/ode/warehouse/dws.db/dws_itm_query_effect_d/dt=20140813/000000_0:402653184+67108864
      14/08/15 10:20:32 WARN org.apache.hadoop.hive.serde2.lazy.LazyStruct.parse(LazyStruct.java:160): Extra bytes detected at the end of the row! Ignoring similar problems.
      14/08/15 10:21:08 INFO org.apache.spark.Logging$class.logInfo(Logging.scala:58): Executor is trying to kill task 6
      14/08/15 10:21:08 INFO org.apache.spark.Logging$class.logInfo(Logging.scala:58): Driver commanded a shutdown
      14/08/15 10:21:08 INFO akka.event.slf4j.Slf4jLogger$$anonfun$receive$1$$anonfun$applyOrElse$3.apply$mcV$sp(Slf4jLogger.scala:74): Shutting down remote daemon.
      14/08/15 10:21:08 INFO akka.event.slf4j.Slf4jLogger$$anonfun$receive$1$$anonfun$applyOrElse$3.apply$mcV$sp(Slf4jLogger.scala:74): Remote daemon shut down; proceeding with flushing remote transports.
      14/08/15 10:21:08 INFO akka.event.slf4j.Slf4jLogger$$anonfun$receive$1$$anonfun$applyOrElse$3.apply$mcV$sp(Slf4jLogger.scala:74): Remoting shut down
      14/08/15 10:21:08 INFO akka.event.slf4j.Slf4jLogger$$anonfun$receive$1$$anonfun$applyOrElse$3.apply$mcV$sp(Slf4jLogger.scala:74): Remoting shut down.
      14/08/15 10:21:08 INFO org.apache.spark.Logging$class.logInfo(Logging.scala:58): Executor killed task 6

      It looks as if some rows contained garbled Chinese characters and were skipped as expected, but the job subsequently failed with the following ERROR:

      14/08/15 10:20:48 INFO org.apache.spark.Logging$class.logInfo(Logging.scala:58): Cancelling stage 1
      14/08/15 10:20:48 INFO org.apache.spark.Logging$class.logInfo(Logging.scala:58): Stage 1 was cancelled
      14/08/15 10:20:48 INFO org.apache.spark.Logging$class.logInfo(Logging.scala:58): Cancelling stage 2
      14/08/15 10:20:48 INFO org.apache.spark.Logging$class.logInfo(Logging.scala:58): Removed TaskSet 2.0, whose tasks have all completed, from pool default
      14/08/15 10:20:48 INFO org.apache.spark.Logging$class.logInfo(Logging.scala:58): Stage 2 was cancelled
      14/08/15 10:20:48 INFO org.apache.spark.Logging$class.logInfo(Logging.scala:58): Failed to run runJob at InsertIntoHiveTable.scala:158
      14/08/15 10:20:48 INFO org.apache.spark.Logging$class.logInfo(Logging.scala:58): finishApplicationMaster with FAILED
      Exception in thread "Thread-2" java.lang.reflect.InvocationTargetException
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      at java.lang.reflect.Method.invoke(Method.java:597)
      at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:199)
      Caused by: org.apache.spark.SparkException: Job 0 cancelled because Stage 1 was cancelled
      at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1049)
      at org.apache.spark.scheduler.DAGScheduler.handleJobCancellation(DAGScheduler.scala:1014)
      at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleStageCancellation$1.apply$mcVI$sp(DAGScheduler.scala:1002)
      at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleStageCancellation$1.apply(DAGScheduler.scala:1001)
      at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleStageCancellation$1.apply(DAGScheduler.scala:1001)
      at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
      at scala.collection.mutable.ArrayOps$ofInt.foreach(ArrayOps.scala:156)
      at org.apache.spark.scheduler.DAGScheduler.handleStageCancellation(DAGScheduler.scala:1001)
      at org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1207)
      at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
      at akka.actor.ActorCell.invoke(ActorCell.scala:456)
      at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
      at akka.dispatch.Mailbox.run(Mailbox.scala:219)
      at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
      at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
      at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
      at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
      at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
      14/08/15 10:20:48 INFO org.apache.spark.Logging$class.logInfo(Logging.scala:58): Invoking sc stop from shutdown hook
      14/08/15 10:20:48 INFO org.apache.spark.Logging$class.logInfo(Logging.scala:58): AppMaster received a signal.
      14/08/15 10:20:48 INFO org.apache.spark.Logging$class.logInfo(Logging.scala:58): Deleting staging directory .sparkStaging/application_1407741429810_7604
      14/08/15 10:20:48 ERROR org.apache.spark.Logging$class.logError(Logging.scala:95): Listener EventLoggingListener threw an exception
      java.lang.reflect.InvocationTargetException
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      at java.lang.reflect.Method.invoke(Method.java:597)
      at org.apache.spark.util.FileLogger$$anonfun$flush$2.apply(FileLogger.scala:166)
      at org.apache.spark.util.FileLogger$$anonfun$flush$2.apply(FileLogger.scala:166)
      at scala.Option.foreach(Option.scala:236)
      at org.apache.spark.util.FileLogger.flush(FileLogger.scala:166)
      at org.apache.spark.scheduler.EventLoggingListener.logEvent(EventLoggingListener.scala:87)
      at org.apache.spark.scheduler.EventLoggingListener.onJobEnd(EventLoggingListener.scala:112)
      at org.apache.spark.scheduler.SparkListenerBus$$anonfun$postToAll$4.apply(SparkListenerBus.scala:52)
      at org.apache.spark.scheduler.SparkListenerBus$$anonfun$postToAll$4.apply(SparkListenerBus.scala:52)
      at org.apache.spark.scheduler.SparkListenerBus$$anonfun$foreachListener$1.apply(SparkListenerBus.scala:81)
      at org.apache.spark.scheduler.SparkListenerBus$$anonfun$foreachListener$1.apply(SparkListenerBus.scala:79)
      at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
      at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
      at org.apache.spark.scheduler.SparkListenerBus$class.foreachListener(SparkListenerBus.scala:79)
      at org.apache.spark.scheduler.SparkListenerBus$class.postToAll(SparkListenerBus.scala:52)
      at org.apache.spark.scheduler.LiveListenerBus.postToAll(LiveListenerBus.scala:32)
      at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(LiveListenerBus.scala:56)
      at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(LiveListenerBus.scala:56)
      at scala.Option.foreach(Option.scala:236)
      at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(LiveListenerBus.scala:56)
      at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1.apply(LiveListenerBus.scala:47)
      at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1.apply(LiveListenerBus.scala:47)
      at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1160)
      at org.apache.spark.scheduler.LiveListenerBus$$anon$1.run(LiveListenerBus.scala:46)
      Caused by: java.io.IOException: Filesystem closed
      at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:565)
      at org.apache.hadoop.hdfs.DFSOutputStream.flushOrSync(DFSOutputStream.java:1544)
      at org.apache.hadoop.hdfs.DFSOutputStream.hflush(DFSOutputStream.java:1526)
      at org.apache.hadoop.fs.FSDataOutputStream.hflush(FSDataOutputStream.java:123)
      ... 27 more
      14/08/15 10:20:48 ERROR org.apache.spark.Logging$class.logError(Logging.scala:95): Listener EventLoggingListener threw an exception
      java.lang.reflect.InvocationTargetException
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      at java.lang.reflect.Method.invoke(Method.java:597)
      at org.apache.spark.util.FileLogger$$anonfun$flush$2.apply(FileLogger.scala:166)
      at org.apache.spark.util.FileLogger$$anonfun$flush$2.apply(FileLogger.scala:166)
      at scala.Option.foreach(Option.scala:236)
      at org.apache.spark.util.FileLogger.flush(FileLogger.scala:166)
      at org.apache.spark.scheduler.EventLoggingListener.logEvent(EventLoggingListener.scala:87)
      at org.apache.spark.scheduler.EventLoggingListener.onApplicationEnd(EventLoggingListener.scala:122)
      at org.apache.spark.scheduler.SparkListenerBus$$anonfun$postToAll$13.apply(SparkListenerBus.scala:70)
      at org.apache.spark.scheduler.SparkListenerBus$$anonfun$postToAll$13.apply(SparkListenerBus.scala:70)
      at org.apache.spark.scheduler.SparkListenerBus$$anonfun$foreachListener$1.apply(SparkListenerBus.scala:81)
      at org.apache.spark.scheduler.SparkListenerBus$$anonfun$foreachListener$1.apply(SparkListenerBus.scala:79)
      at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
      at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
      at org.apache.spark.scheduler.SparkListenerBus$class.foreachListener(SparkListenerBus.scala:79)
      at org.apache.spark.scheduler.SparkListenerBus$class.postToAll(SparkListenerBus.scala:70)
      at org.apache.spark.scheduler.LiveListenerBus.postToAll(LiveListenerBus.scala:32)
      at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(LiveListenerBus.scala:56)
      at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(LiveListenerBus.scala:56)
      at scala.Option.foreach(Option.scala:236)
      at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(LiveListenerBus.scala:56)
      at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1.apply(LiveListenerBus.scala:47)
      at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1.apply(LiveListenerBus.scala:47)
      at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1160)
      at org.apache.spark.scheduler.LiveListenerBus$$anon$1.run(LiveListenerBus.scala:46)
      Caused by: java.io.IOException: Filesystem closed
      at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:565)
      at org.apache.hadoop.hdfs.DFSOutputStream.flushOrSync(DFSOutputStream.java:1544)
      at org.apache.hadoop.hdfs.DFSOutputStream.hflush(DFSOutputStream.java:1526)
      at org.apache.hadoop.fs.FSDataOutputStream.hflush(FSDataOutputStream.java:123)
      ... 27 more
      14/08/15 10:20:48 INFO org.apache.spark.Logging$class.logInfo(Logging.scala:58): Stopped Spark web UI at http://I147-41:33194
      14/08/15 10:20:48 INFO org.apache.spark.Logging$class.logInfo(Logging.scala:58): Stopping DAGScheduler
      14/08/15 10:20:48 INFO org.apache.spark.Logging$class.logInfo(Logging.scala:58): Shutting down all executors
      14/08/15 10:20:48 INFO org.apache.spark.Logging$class.logInfo(Logging.scala:58): Asking each executor to shut down
      14/08/15 10:20:49 INFO org.apache.spark.Logging$class.logInfo(Logging.scala:58): MapOutputTrackerActor stopped!
      14/08/15 10:20:50 INFO org.apache.spark.Logging$class.logInfo(Logging.scala:58): Selector thread was interrupted!
      14/08/15 10:20:50 INFO org.apache.spark.Logging$class.logInfo(Logging.scala:58): ConnectionManager stopped
      14/08/15 10:20:50 INFO org.apache.spark.Logging$class.logInfo(Logging.scala:58): MemoryStore cleared
      14/08/15 10:20:50 INFO org.apache.spark.Logging$class.logInfo(Logging.scala:58): BlockManager stopped
      14/08/15 10:20:50 INFO org.apache.spark.Logging$class.logInfo(Logging.scala:58): Stopping BlockManagerMaster
      14/08/15 10:20:50 INFO org.apache.spark.Logging$class.logInfo(Logging.scala:58): BlockManagerMaster stopped
      Exception in thread "Thread-57" java.io.IOException: Filesystem closed
      at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:565)
      at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1247)
      at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1212)
      at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:276)
      at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:265)
      at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:82)
      at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:886)
      at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:867)
      at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:766)
      at org.apache.spark.util.FileLogger.createWriter(FileLogger.scala:125)
      at org.apache.spark.util.FileLogger.newFile(FileLogger.scala:189)
      at org.apache.spark.scheduler.EventLoggingListener.stop(EventLoggingListener.scala:129)
      at org.apache.spark.SparkContext$$anonfun$stop$2.apply(SparkContext.scala:992)
      at org.apache.spark.SparkContext$$anonfun$stop$2.apply(SparkContext.scala:992)
      at scala.Option.foreach(Option.scala:236)
      at org.apache.spark.SparkContext.stop(SparkContext.scala:992)
      at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$1.run(ApplicationMaster.scala:461)
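
      For context, Hive's LazyStruct emits "Extra bytes detected at the end of the row!" when a delimited data row contains more fields (or trailing bytes) than the table schema declares; the extra data is ignored and parsing continues. A minimal sketch of that check in plain Python (the function name, delimiter, and column counts are illustrative, not Hive's actual implementation):

      ```python
      def parse_row(line, num_columns, delimiter="\x01"):
          """Split a delimited text row into at most num_columns fields,
          mimicking how a lazy serde flags trailing extra bytes."""
          fields = line.rstrip("\n").split(delimiter)
          extra = len(fields) > num_columns
          if extra:
              # Analogous to LazyStruct's warning; the surplus fields are dropped.
              print("WARN: extra bytes detected at the end of the row; ignoring them")
          return fields[:num_columns], extra

      # A row with 4 fields read against a 3-column schema triggers the warning.
      row, had_extra = parse_row("a\x01b\x01c\x01junk", num_columns=3)
      ```

      As in Hive, this is a warning rather than an error: the row is still returned with the declared number of columns. The later "Filesystem closed" IOException is a separate issue, raised when the event-logging listener flushes to HDFS after the shared DFSClient has already been shut down.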


          People

            Assignee: Unassigned
            Reporter: pengyanhong
            Votes: 0
            Watchers: 2
